effective information for land use management. Sentinel-2 imagery is
considered able to provide better information on land cover because it
has a spatial resolution of 10 meters. Convolutional Neural [...]
Convolutional Neural Network method which gives special treatment to
the dimension reduction of the input data. The dimension reduction is
carried out with the PCA method so that data processing becomes faster
without losing important information, which improves the performance of
the method. The PCA-CNN method is implemented on a dataset of the
Situbondo district, which is classified into five land cover classes.
The PCA-CNN method achieves an Overall Accuracy of 94.4% and a Kappa
index of 0.92 after 100 epochs.

(https://creativecommons.org/licenses/by/4.0/).


Keywords— Land Cover, Sentinel-2, Deep 
Learning, PCA, CNN. 


I. INTRODUCTION

The large size of the Situbondo area and the fact that it has not been
mapped properly are obstacles to developing and planning the area.
Automated land cover monitoring and classification is required to
monitor existing land use. Analyzing the earth's land cover
automatically over a large area requires geospatial data in the form of
satellite image data. One satellite image source that can be used is
Sentinel-2. Sentinel-2 imagery is generated from remote sensing by the
Sentinel-2 satellite. The Sentinel-2 satellite is equipped with a
multispectral imager, and its imagery is multispectral with 13 bands
[11]. Automation methods for processing Sentinel-2 satellite imagery
include the use of deep learning. Deep learning is a learning method
for data that aims to create a multilevel data representation [1]. The
key point of deep learning is that the data representation is not made
explicitly by humans but is generated by an algorithm [5]. According to
Heryadi and [5], in the last ten years the application of deep learning
has shown that models based on Convolutional Neural Networks (CNN) with
deep structures perform excellently in the field of pattern processing,
such as object classification in images. CNN or ConvNet is a deep
feed-forward artificial neural network that is widely applied in image
analysis. A CNN consists of one input layer, one output layer, and a
number of hidden layers [10].


II. METHODOLOGY 
2.1 Principal Component Analysis (PCA) 


Dimension reduction is the process of reducing the number of variables
without losing the important information contained in the initial data.
One of the methods used for dimension reduction is Principal Component
Analysis (PCA). PCA transforms the n initial variables into k new
variables called Principal Components (PCs). The number k is less than
n, yet the k PCs produce results close to those obtained with all n
variables. Each PC is a linear combination of the initial variables and
is uncorrelated with the other PCs. The steps for dimension reduction
using PCA are as follows:


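1. Compile the input matrix $X$ from the attribute data $X_{ij}$, where
$i = 1, 2, \dots, n$ and $j = 1, 2, \dots, m$.

$$
X = \begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1m} \\
X_{21} & X_{22} & \cdots & X_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{nm}
\end{bmatrix}
$$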


2. Calculate the mean $\bar{X}$, which satisfies the following
equation:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$


3. Calculate the covariance matrix $C$, which satisfies the following
equation:

$$C = \frac{1}{n-1}\,(X - \bar{X})(X - \bar{X})^{T}$$

4. Calculate the eigenvalues $\lambda$, which satisfy the following
equation:

$$|C - \lambda I| = 0$$

5. Calculate the eigenvectors $v$, which satisfy the following
equation:

$$[C - \lambda I][v] = 0$$


6. Extract the eigenvalues from the diagonal of the eigenvalue matrix
and sort them in descending order.


7. The number of eigenvector columns to select as PCs can be determined
in the following ways:


a. Using a scree plot of the proportion of variance: PCs are kept up to
the point where the curve no longer decreases sharply, which
generally corresponds to PCs with eigenvalues greater than 1.


b. Using the cumulative proportion of variance, which is formulated as
follows:

$$\mathrm{PC}_k = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{p} \lambda_i} \times 100\%$$

with $\lambda_1 > \lambda_2 > \cdots > \lambda_p$. The selected PCs
should together reach a cumulative proportion of variance of at least
80% [8].
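8. The new variables resulting from the reduction are obtained by
multiplying the eigenvector matrix by the input:

$$
\begin{aligned}
\mathrm{PC}_1 &= e_1^{T} X = e_{11}X_1 + e_{21}X_2 + \cdots + e_{p1}X_p \\
\mathrm{PC}_2 &= e_2^{T} X = e_{12}X_1 + e_{22}X_2 + \cdots + e_{p2}X_p \\
&\;\;\vdots \\
\mathrm{PC}_p &= e_p^{T} X = e_{1p}X_1 + e_{2p}X_2 + \cdots + e_{pp}X_p
\end{aligned}
$$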

2.2 Convolutional Neural Networks (CNN) 
Convolutional Neural Networks (CNN) or ConvNet is a deep feed-forward
artificial neural network that is widely applied in image analysis. A
CNN consists of an input layer, an output layer, and a number of hidden
layers. The hidden layers generally contain convolutional layers,
pooling layers, normalization layers, ReLU layers, fully connected
layers, and loss layers. All the layers are stacked on top of one
another. CNN uses a three-dimensional architecture, namely width,
height, and depth. The width and height dimensions represent the image
(texture and morphology), while the depth dimension represents the
color channels [11]. The CNN architecture is shown in Figure 1 [1].


Fig.1. CNN Architecture (the diagram includes a feature extraction stage)
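
To make the layer types concrete, the following NumPy sketch runs one forward pass through a convolutional layer, a ReLU layer, a pooling layer, and a fully connected layer with softmax output. It is only an illustration under stated assumptions: the 8 x 8 x 3 input, the four 3 x 3 kernels, the five output classes, and the random (untrained) weights are hypothetical and do not reproduce the architecture of this study.

```python
import numpy as np

def conv2d(image, kernels):
    """Valid convolution (cross-correlation, as in most CNN libraries).
    image: (H, W, C); kernels: (F, kh, kw, C); returns (H-kh+1, W-kw+1, F)."""
    H, W, _ = image.shape
    F, kh, kw, _ = kernels.shape
    out = np.zeros((H - kh + 1, W - kw + 1, F))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw, :]
            # One dot product per kernel over width, height, and depth.
            out[i, j, :] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling over the width and height dimensions."""
    H, W, F = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size, :]
    return x.reshape(H2, size, W2, size, F).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                     # width x height x depth (hypothetical)
kernels = rng.normal(size=(4, 3, 3, 3))           # four 3x3 filters, untrained
dense_weights = rng.normal(size=(3 * 3 * 4, 5))   # fully connected layer, 5 classes

features = max_pool(relu(conv2d(image, kernels)))             # feature extraction
probabilities = softmax(features.reshape(-1) @ dense_weights) # classification
print(features.shape, probabilities.shape)   # (3, 3, 4) (5,)
```

The depth of the input is where dimension reduction matters: fewer input channels mean fewer multiplications in every convolution window.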


2.3 Sentinel-2 

The Sentinel-2 satellite is a European optical imaging satellite first
launched in 2015 as part of the European Space Agency (ESA) Copernicus
program. It carries a wide-swath, high-resolution multispectral imager
with 13 spectral bands. The Sentinel-2 system is often referred to as a
twin satellite, namely Sentinel-2A (S2A) and Sentinel-2B (S2B), because
the two work in sync so that they appear to be one satellite. Each
satellite has a revisit time (temporal resolution) of 10 days.
Sentinel-2A and Sentinel-2B have a revisit time offset of 5 days (phase
shift of 180°), so the same location on the earth's surface is recorded
alternately by S2A and S2B every 5 days. The Sentinel-2 satellite has
several sensors, including Visible and Near Infrared (VNIR) and Near
Infrared (NIR) to Short Wave Infrared (SWIR). The Sentinel-2 satellite
can be used for supporting services such as forest monitoring, land
cover change detection, and natural disaster management [2].
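
The revisit arithmetic can be checked with a short sketch. Day 0 is an arbitrary reference date and the helper name `acquisition_days` is hypothetical; only the 10-day period and the 5-day offset come from the text.

```python
def acquisition_days(first_day, period, horizon):
    """Days (relative to an arbitrary day 0) on which one satellite revisits a site."""
    return list(range(first_day, horizon, period))

s2a = acquisition_days(first_day=0, period=10, horizon=30)   # Sentinel-2A
s2b = acquisition_days(first_day=5, period=10, horizon=30)   # Sentinel-2B, 5-day offset
combined = sorted(s2a + s2b)

# Gap between consecutive acquisitions of the two-satellite system.
gaps = [later - earlier for earlier, later in zip(combined, combined[1:])]
print(combined)   # [0, 5, 10, 15, 20, 25]
print(gaps)       # [5, 5, 5, 5, 5]
```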


2.4 Evaluation of the model 


The model in this study was evaluated with accuracy tests based on a
confusion matrix, which yields the producer's accuracy, user's
accuracy, overall accuracy, and kappa index. Producer's accuracy is the
accuracy seen from the side of the map producer, while user's accuracy
is the accuracy seen from the side of the map user. Overall accuracy is
the accuracy of the model as a whole, while the kappa index is a
measure of the consistency between two measurement tools or methods.
The formulas are given in Table 1.


Table 1. Classification Model Evaluation Measures


No.  Measure              Formula
1.   Producer's Accuracy  $\dfrac{X_{ii}}{X_{+i}} \times 100\%$
2.   User's Accuracy      $\dfrac{X_{ii}}{X_{i+}} \times 100\%$
3.   Overall Accuracy     $\dfrac{\sum_{i=1}^{r} X_{ii}}{N} \times 100\%$
4.   Kappa Index          $\dfrac{\frac{1}{N}\sum_{i=1}^{r} X_{ii} - \frac{1}{N^{2}}\sum_{i=1}^{r} X_{i+}X_{+i}}{1 - \frac{1}{N^{2}}\sum_{i=1}^{r} X_{i+}X_{+i}}$


Where $X_{ii}$ is the diagonal value in the i-th row and i-th column of
the matrix, $X_{+i}$ is the number of pixels in the i-th column,
$X_{i+}$ is the number of pixels in the i-th row, $N$ is the number of
pixels in the sample, and $r$ is the number of classes. The layout of
the confusion matrix is illustrated in Figure 2.
                        Actual Class
                        1       2       3
Prediction      1       X11     X12     X13
Class           2       X21     X22     X23
                3       X31     X32     X33


Fig.2. Confusion Matrix
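
The four evaluation measures can be computed directly from a confusion matrix laid out as in Figure 2 (rows are predicted classes, columns are actual classes). The sketch below uses a made-up three-class matrix; the function name `accuracy_measures` and the numbers are assumptions for illustration and are not results of this study. The kappa index is returned on a 0 to 1 scale.

```python
import numpy as np

def accuracy_measures(confusion):
    """Producer's/user's accuracy (%), overall accuracy (%), and kappa index.
    Rows of `confusion` are predicted classes, columns are actual classes."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()                 # number of pixels in the sample
    diagonal = np.diag(cm)           # correctly classified pixels per class
    column_totals = cm.sum(axis=0)   # pixels per actual class
    row_totals = cm.sum(axis=1)      # pixels per predicted class
    producers = diagonal / column_totals * 100
    users = diagonal / row_totals * 100
    overall = diagonal.sum() / total * 100
    chance = (row_totals * column_totals).sum() / total ** 2
    kappa = (diagonal.sum() / total - chance) / (1 - chance)
    return producers, users, overall, kappa

# Hypothetical confusion matrix with 50 actual pixels per class.
example = [
    [45, 3, 2],
    [4, 40, 1],
    [1, 7, 47],
]
producers, users, overall, kappa = accuracy_measures(example)
print(producers)                          # producer's accuracy per class: 90, 80, 94
print(round(overall, 2), round(kappa, 2)) # 88.0 0.82
```

In this example 132 of 150 pixels lie on the diagonal, so the overall accuracy is 88%, while the agreement expected by chance is 1/3, which lowers the kappa index to 0.82.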


According to [8], the strength of agreement between two measurement
tools or methods can be categorized by the kappa index as shown in
Table 2.


Table 2. Strength of Agreement by Kappa Index


Kappa Index     Strength of Agreement
< 0.20          Poor
0.21-0.40       Fair
0.41-0.60       Moderate
0.61-0.80       Strong
0.81-0.99       Very strong
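
The categories of Table 2 can be expressed as a small lookup. The table leaves the boundaries between categories (for example exactly 0.20) and the value 1.00 unassigned, so this sketch assumes that each upper bound is inclusive and that 1.00 counts as very strong; the function name `kappa_strength` is hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the first four categories in Table 2 (assumed inclusive).
UPPER_BOUNDS = [0.20, 0.40, 0.60, 0.80]
LABELS = ["Poor", "Fair", "Moderate", "Strong", "Very strong"]

def kappa_strength(kappa):
    """Return the strength-of-agreement label for a kappa index on a 0-1 scale."""
    return LABELS[bisect_left(UPPER_BOUNDS, kappa)]

print(kappa_strength(0.92))   # Very strong
print(kappa_strength(0.50))   # Moderate
```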


III. RESEARCH 
3.1 Study area and data source 


The research was conducted from January to July 2022. The research
area covers part of Situbondo Regency. Data were collected from
Sentinel-2 satellite imagery obtained from
https://scihub.copernicus.eu/. The equipment used in this study was a
laptop with an Intel® Core™ i5-3337U CPU @ 1.80GHz, 8.00 GB RAM, an
NVIDIA GeForce GT720M with 2GB VRAM, and a 64-bit OS. ESA SNAP 8.0
software was used for data preprocessing, and Google Colab was used for
the data classification process. The Sentinel-2 data used in this study
cover part of the Situbondo district, East Java province. The image was
taken on July 14, 2021 at 02:25:41 GMT. The downloaded Sentinel-2
product is named
“S2A_MSIL2A_20210714T022551_N0301_R046_T49MHM_20210714T070327”.
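
The product name encodes the mission, processing level, sensing time, processing baseline, relative orbit, and tile. The sketch below parses it with the Python standard library, assuming the underscore-separated compact Sentinel-2 naming convention (the spaces in the extracted name are taken to stand for underscores); the pattern and the field names are illustrative and not part of any official tool.

```python
import re
from datetime import datetime

# Assumed layout: MISSION_LEVEL_SENSINGTIME_Nbaseline_Rorbit_Ttile_DISCRIMINATOR
PRODUCT_PATTERN = re.compile(
    r"^(?P<mission>S2[AB])_(?P<level>MSIL1C|MSIL2A)_(?P<sensing>\d{8}T\d{6})"
    r"_N(?P<baseline>\d{4})_R(?P<orbit>\d{3})_T(?P<tile>[0-9A-Z]{5})"
    r"_(?P<discriminator>\d{8}T\d{6})$"
)

def parse_product_name(name):
    """Split a Sentinel-2 product name into its fields; raise ValueError if it does not match."""
    match = PRODUCT_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized product name: {name}")
    fields = match.groupdict()
    fields["sensing"] = datetime.strptime(fields["sensing"], "%Y%m%dT%H%M%S")
    return fields

fields = parse_product_name(
    "S2A_MSIL2A_20210714T022551_N0301_R046_T49MHM_20210714T070327"
)
print(fields["mission"], fields["level"], fields["tile"], fields["sensing"].date())
```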


3.2 PCA-CNN Model Input Variables and Parameters


Modeling satellite imagery for land cover analysis in Situbondo Regency
involves several stages. The first stage is determining the model
parameters. The parameters of the PCA-CNN model include the number of
convolutional layers, the type of pooling, and the activation function.
The parameters of the PCA-CNN model can be seen in appendix 4. The
second stage is determining the batch_size and the number of iterations
(epochs) for the model run. The PCA-CNN model uses batch_size = 20 and
100 iterations (epochs). For each class, 1000 images are used as
training data and 500 images are used as testing data.
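
These settings determine how much work one training run involves. The short calculation below assumes the five land cover classes named in the abstract, that every training image is used once per epoch, and that each batch produces one weight update; the variable names are chosen for the example.

```python
import math

num_classes = 5          # five land cover classes (from the abstract)
train_per_class = 1000
test_per_class = 500
batch_size = 20
epochs = 100

train_images = num_classes * train_per_class                # 5000
test_images = num_classes * test_per_class                  # 2500
steps_per_epoch = math.ceil(train_images / batch_size)      # 250 batches per epoch
total_updates = steps_per_epoch * epochs                    # 25000 weight updates
train_share = train_images / (train_images + test_images)   # two thirds of all images

print(train_images, test_images, steps_per_epoch, total_updates)
```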


3.3 Classification Result and Visual Assessment 


The results of the classification process using the PCA-CNN model are
presented in the “Training and test accuracy” graph (Figure 3.a) and
the “Training and test loss” graph (Figure 3.b).


Fig.3.a Graph of “Training and Test Accuracy” (horizontal axis: 0 to 100)


Fig.3.b Graph of “Training and Test Loss” (horizontal axis: 0 to 100)


In Figure 3.a, the blue line shows the training accuracy. The accuracy
increases with each iteration, which indicates that the model trains
well, so the accuracy is stable and continues to increase. The orange
line, which shows the test accuracy, behaves differently: the test
accuracy fluctuates. These fluctuations indicate that the model is
still adjusting substantially in each iteration. At the end of the
iterations, the test accuracy is not far from the training accuracy, so
the model can be said to be neither overfitting nor failing to make
correct predictions. The results in Figure 3.a mirror those in Figure
3.b. Figure 3.b shows the error the model makes in the classification
process. When Figure 3.a shows a high accuracy value, Figure 3.b shows
a correspondingly low loss at the same iteration. The detailed results
of the PCA-CNN model classification process are shown in the confusion
matrix in Figure 4.


Fig.4. Confusion Matrix of PCA-CNN Model


3.4 Classification accuracy assessment 
The model was tested on test data obtained by splitting the dataset
with the hold-out method. The predictive results of the PCA-CNN method
are shown in Table 3.

Table 3. PCA-CNN Model Prediction Results


PCA-CNN
Class                                          Producer Accuracy (%)   User Accuracy (%)
Plantation (Kebun)                             90.5                    95.9
Housing (Perumahan)                            100                     93.6
Dryland agriculture (Pertanian Lahan Kering)   88.54                   99.23
Rice field (Sawah)                             96.9                    87.5
Water body (Tubuh Air)                         97.5                    95.5

Overall Accuracy (%)   Kappa Index
94.4                   0.92


The values in Table 3 are obtained from the confusion matrix in Figure
4. Table 3 shows that the highest accuracy value among the five land
cover classes is the Producer Accuracy of the housing class, which is
100%. That is, every test sample that actually belongs to the housing
class was classified correctly by the PCA-CNN model. The Overall
Accuracy of the PCA-CNN model is 94.4% with a kappa index of 0.92.
These values show that the model's predictions on the test data are
very good, being above 80%.


IV. CONCLUSION 


The PCA-CNN method can be applied to land cover classification using
Sentinel-2 imagery with five main classes, namely plantation (kebun),
housing (perumahan), dryland agriculture (pertanian lahan kering), rice
field (sawah), and water body (tubuh air). The PCA-CNN model achieves
an Overall Accuracy of 94.4% with a kappa index of 0.92.